Improving Application Performance by Dynamically Trading Frequency for Complexity in a GALS Microprocessor∗
Authors
Abstract
Microprocessors are traditionally designed to provide “best overall” performance across a wide range of applications and operating environments. Several groups have proposed hardware techniques that save energy by “downsizing” hardware resources that are underutilized by the current application. We explore the converse: improving performance by “upsizing” resources for which the application has greater needs. Our proposal depends critically on the ability to change frequencies independently in separate domains of a globally asynchronous, locally synchronous (GALS) microprocessor. We use a variant of a multiple clock domain (MCD) processor, with four independently clocked domains. Each domain is streamlined with modest hardware structures for very high clock frequency. Key structures can then be upsized on demand to exploit more distant parallelism, improve branch prediction, or increase cache capacity. Although doing so requires decreasing the associated domain frequency, other domain frequencies are unaffected. Measuring across a broad suite of application benchmarks, we find that configuring our MCD processor just once per application yields performance 17.6% better, on average, than that of the “best overall” fully synchronous design. By adapting automatically to application phases, we can increase this advantage to more than 20%.
∗This work was supported in part by NSF grants CCR-9701915, CCR-9811929, CCR-9988361, EIA-0080124, and CCR-0204344; by DARPA/ITO under AFRL contract F29601-00-K-0182; by an IBM Faculty Partnership Award; and by equipment grants from IBM and Intel.
Authors’ current addresses: Greg Semeraro, Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623, [email protected]; David Albonesi, Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, [email protected]; Steven Dropsho, EPFL–IC–LABOS, CH 1015 Lausanne, Switzerland, [email protected]; Grigorios Magklis, Intel Labs Barcelona, Jordi Girona 29-3A, 08034 Barcelona, Spain, [email protected]; Michael Scott, Department of Computer Science, University of Rochester, Rochester, NY 14627, [email protected].
Similar resources
Improving Application Performance by Dynamically Balancing Speed and Complexity in a GALS Microprocessor
Microprocessors are traditionally designed to provide “best overall” performance across a wide range of applications and operating environments. Several groups have proposed hardware techniques that save energy by “downsizing” hardware resources that are underutilized by particular applications. We explore the converse: “upsizing” hardware resources in order to improve performance relative to a...
An Architectural and Circuit-Level Approach to Improving the Energy Efficiency of Microprocessor Memory Structures
We present a combined architectural and circuit technique for reducing the energy dissipation of microprocessor memory structures. This approach exploits the subarray partitioning of high speed memories and varying application requirements to dynamically disable partitions during appropriate execution periods. When applied to 4-way set associative caches, trading off a 2% performance degradatio...
Reclaiming Performance and Energy Efficiency from Variability
This paper presents methods for addressing two sources of variability in the context of microprocessors, within-die process variability and dynamic thermal variability, and shows the improvements in performance and energy efficiency obtained by applying them to a globally asynchronous, locally synchronous (GALS) microprocessor design. The GALS design style partitions the core into several indepen...
Improving risk-adjusted performance in high-frequency trading: the role of fuzzy logic systems
In recent years, algorithmic and high-frequency trading have been the subject of increasing risk concerns. A general theme that we adopt in this thesis is that trading practitioners are predominantly interested in risk-adjusted performance. Likewise, regulators are demanding stricter risk controls. First, we scrutinise conventional AI model design approaches with the aim to increase the risk-ad...
Improving Power Efficiency in Stream Processors Through Dynamic Reconfiguration
Stream processors support hundreds of functional units in a programmable architecture by clustering those units and utilizing a bandwidth hierarchy. This paper presents mechanisms to enable dynamic reconfiguration of the number of clusters in a stream processor to match the time-varying data parallelism of an application. Many embedded applications go through several phases of execution, marked...